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Increasingly complex applications involve large datasets in combination with non- 
linear and high dimensional mathematical models. In this context, statistical infer- 
ence is a challenging issue that calls for pragmatic approaches that take advantage 
of both Bayesian and frequentist methods. The elegance of Bayesian methodology is 
founded in the propagation of information content provided by experimental data 
and prior assumptions to the posterior probability distribution of model predic- 
tions. However, for complex applications experimental data and prior assumptions 
potentially constrain the posterior probability distribution insufficiently. In these 
situations Bayesian Markov chain Monte Carlo sampling can be infeasible. From a 
frequentist point of view insufficient experimental data and prior assumptions can 
be interpreted as non-identifiability. The profile likelihood approach offers to detect 
and to resolve non-identifiability by experimental design iteratively. Therefore, it 
allows one to better constrain the posterior probability distribution until Markov 
chain Monte Carlo sampling can be used securely. Using an application from cell 
biology we compare both methods and show that a successive application of both 
methods facilitates a realistic assessment of uncertainty in model predictions. 

Keywords: identifiability, profile likelihood, Bayesian Markov chain Monte 
Carlo sampling, posterior propriety, propagation of uncertainty, prediction 
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1. Introduction 

In many scientific situations mathematical models are used to predict properties 
of a system under study using explicitly formulated assumptions and hypotheses. 
This is especially important for complex applications that do not allow for an 
intuitive understanding of the processes involved. Applications range from analyses 
in particle physics [1] or in climate research [2, 3] to quantifying dynamical processes 
in cell biology [4]. Often, these mathematical models contain parameters that are 
unknown or known only with large uncertainty. For example in bio-chemical models 
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that are utilised in cell biology, parameters such as reaction rate constants, amount 
of molecular compounds, detection sensitivities or measurement backgrounds are 
often unknown. Before a model can be used for prediction reliably, the unknown 
parameters have to be estimated by comparing model output to experimental data. 

For a realistic assessment of the accuracy of model predictions, it is important 
that uncertainties in the experimental data and in prior assumptions are propa- 
gated correctly via the parameters to the desired predictions. Bayesian Markov 
chain Monte Carlo (MCMC) sampling facilitates this propagation of uncertainties 
by sampling from the posterior probability distribution [5]. However, if experimen- 
tal data is limited and the mathematical models are non-linear and contain many 
unknown parameters the posterior probability distribution can be insufficiently con- 
strained. For such insufficiently constrained posterior the probability mass can be 
distributed widely in a high dimensional parameter space. Consequently, MCMC 
sampling can quickly become infeasible. 

Alternatively, one can resort to frequentist methods in this situation. Here, 
insufficient experimental data and prior assumptions can be interpreted as non- 
identifiability of the model parameters [6J. We applied a generic approach that 
uses the profile likelihood to detect both structural and practical non-identifiability 
[7]. Furthermore, this approach allows one to design new experiments that resolve 
non-identifiability. Therefore, it is beneficial to further constrain the posterior prob- 
ability distribution until MCMC sampling can be applied reliably and efficiently. 

We compared the results of both MCMC sampling and profile likelihood meth- 
ods. In the absence of non-identifiability the results of both methods are in good 
agreement. However, in the presence of non-identifiability their results can be sub- 
stantially different. Our results imply that MCMC sampling in the presence of 
non-identifiability can be misleading. Therefore, we suggest a successive applica- 
tion of both methods that ensures a realistic assessment of uncertainty in model 
predictions. 

(a) Frequentist methods 

Maximum likelihood estimation of model parameters is a theoretically well de- 
veloped area [8]. Based on an assumption on the distribution of the measurement 
noise, the likelihood function L(y\9) of the data y given the parameters 6 describes 
the agreement of model output and experimental data. In case of normally dis- 
tributed measurement noise e ~ N(0,<r 2 ) the likelihood reads as 

™-nn^-(-K a ^)') 

where m model outputs yk(ti, 0) and dk data instances for each model output such 
as time points t\ can be considered. The maximum of L(y\0), i.e. the best fit of the 
model to the data, provides a point estimate 6 of the parameters. This maximum 
likelihood estimate (MLE) can be calculated for non-linear models by numerical 
optimisation methods, see e.g. the trust region algorithm in [9] and for a general 
introduction in [10, 11]. The uncertainty of the estimate 6 is buried in the shape 
of the likelihood function. Figure 1 shows an illustration of the likelihood for three 
typical cases. 
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If the amount of model quantities that can be accessed by experiments is limited, 
a subset of parameters can be structurally non-identifiable. This indicates that the 
parametrisation of the model is such that two or more parameters can compensate 
their effects and yield exactly the same model outputs yk(ti,0). This in turn re- 
sults in a constant likelihood value on a sub-manifold, see figure 1A. Consequently, 
the MLE for the parameters cannot be determined uniquely. The parameter rela- 
tions that cause the structural non-identifiability are akin to gauge invariances in 
physical theories. However, for complex models it can be difficult to detect struc- 
tural non-identifiability. For example, if models can only be evaluated by numerical 
simulation, such as in the case of detector models used in particle physics [12] or 
dynamical models that are used in cell biology [4], structural non-identifiability 
cannot be detected directly. In the latter case, methods for a priori structural non- 
identifiability analysis exist that analyse the structure of the ordinary differential 
equations (ODE) without having an analytical solution. A comparison of these 
methods can be found in [13]. However, a priori methods are often limited to lin- 
ear ODE systems or are impractical for models containing many parameters [14]. 
Arguably one of the the most practical a priori methods that has also been proved 
to work for larger models applies a probabilistic algorithm [15, 16]. 

In addition to structural non-identifiability, model parameter can be practically 
non-identifiable [7] . This type of non-identifiability arises if the amount and quality 
of experimental data is limited. It cannot be detected by a priori methods. However, 
practical non-identifiability is of equal importance. A generic approach that allows 
one to detect both structural and practical non-identifiability at the same time 
uses the concept of the profile likelihood [7, 17]. The profile likelihood PL can be 
calculated for each parameter $i individually by 

PL(y\O i )=m BX .[L(y\0)]. (1.2) 

The equation indicates that for each value of 9i all of the remaining parameters 
9j are re-optimised, see figure 1 for illustration. The profiles PL(y\6i) break down 
the uncertainty contained in the high-dimensional likelihood L(y\6) to a footprint 
in one dimension. It allows for reliable conclusions, about whether a parameter 
can be inferred from the experimental data. Three typical cases arise and can be 
detected from the profiles. A flat profile with a constant value indicates structural 
non-identifiability (c.f. figure 1A). A profile that decreases but tails out to a plateau 
to one or both sides indicates practical non-identifiability (c.f. figure IB). A profile 
that tails out to zero on both sides quickly enough, i.e. at least exponentially fast, 
indicates structural and practical identifiability (c.f. figure 1C). Experimental design 
and model reduction strategies based on the profile likelihood allows one to resolve 
parameter non-identifiabilities iteratively, for an application see in [18]. 

Furthermore, the profile likelihood allows one to assess likelihood based confi- 
dence intervals [20-22]. A confidence interval [a~ , af\ to a confidence level a = 0.95 
signifies that the true value of the parameter 6* is expected to be inside the interval 
with 95% probability. Using a threshold A a = Q(Xdp Q ) which is the a quantile 
of the x 2 ~distribution with df degrees of freedom [10] , confidence intervals can be 
determined from the profiles by {9 l - 2 ■ \og(PL(y\0i) / L(y\§)) < A a }. 



Article to appear in Phil. Trans. Roy. Soc. A 



1 



A. Raue and others 



A structural non-identifiability Q practical non-identifiability Q both parameters identifiable 




Figure 1. Identifiability analysis using likelihood profiles: The upper panels show as illus- 
tration the shape of the likelihood L(y\8) in two dimensions for three typical cases. The 
traces of the profiles in parameter space are indicated as red lines. The lower panels shows 
the respective profile likelihood PL(y\6i) for the dimension of parameter 8\ as red lines. 
In all panels the asterisks denote the MLE in cases where a unique solution exists and 
the dashed lines denote the threshold A Q that yields a likelihood based confidence region 
[10, 19]. Three typical cases can arise: (A) A flat profile indicates structural non-iden- 
tifiability. In this case, no unique solution for MLE exist. (B) A profile that decreases but 
tails out to a plateau to one or both sides indicates practical non-identifiability. (C) A 
profile that tails out to zero on both sides quickly enough, i.e. at least exponentially fast, 
indicates structural and practical identifiability. The confidence interval of 9x, for (A) is 
infinite, for (B) has only a lower bound, for (C) has a finite range. 

(6) Bayesian methods 

By applying Bayes' theorem the likelihood function (1.1) is extended by the 
prior probability density function (PDF) of the parameters P(9) and normalised 
by a factor c yielding the posterior PDF of the parameters 

P(9\y) = c-L(y\9)-P(9). (1.3) 

In analogy to the MLE, the maximum a posteriori (MAP) estimate is defined by 
maximising P(9\y). The only decisive difference to frequentist methodology is the 
choice of the prior P(9). The direct computation of the normalisation factor c is not 
feasible for a high dimensional parameter space. However, extensive Markov chain 
Monte Carlo (MCMC) sampling offers a way to evaluate P(6\y) despite unknown c. 
This intriguing feature opens the prospect of considering the full high dimensional 
posterior PDF for statistical inference. Most prominently, the Metropolis-Hastings 
algorithm [23, 24] defines a Markov process where transitions 9 — > 6' are generated 
using a proposal function q(9\9') that eventually produces a series of samples of the 
posterior PDF P(6\y). The transitions are accepted with probability 

a(9, 9') = min[l, (L(y\9')/L(y\9)) ■ (q(9\9')/q(9'\9))}. (1.4) 
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For efficiency of the sampling the choice of the proposal function q(9\6') is impor- 
tant. Often, proposals drawn from a multivariate normal distribution are conve- 
nient. One of the most simple implementations uses q(9\9 r ) ~ N(0, s ■ I) where s ■ I 
is a scaled identity matrix (SIM). Too small a choice of s in relation to the actual 
shape of the posterior PDF will cause the process to converge slowly and to pro- 
duce correlated samples. Too large a choice of s will lead to rejection of too many 
proposals 0' yielding a slow sampling. Assuming that the posterior PDF is a multi- 
variate normal distribution the optimal acceptance rate of proposals is « 0.23 [25]. 
Nevertheless, in the light of complex and non-linear models with possibly limited 
amount and accuracy of experimental data, this assumption is problematic because 
the shape of the posterior PDF can be far from the PDF of a normal distribution. 
For a high dimensional parameter space computational efficiency also becomes an 
important issue. In these cases, the Markov chain may have to move along complex 
structures [19, figure 8.2.2]. To increase efficiency more sophisticated methods take 
into account the natural geometry of the posterior PDF, e.g. the manifold Metropo- 
lis adjusted Langevin algorithm (MMALA) takes into account the local gradient 
and curvature information [26]. 

Before applying an MCMC sampling method, the prior PDF of the parameters 
needs to be specified. Given that empirical evidence exists about the distribution 
of a parameter, such as a previous measurements or estimation, P{9) should in- 
corporate this prior knowledge accordingly [27]. If no empirical evidence about the 
parameter value is available, the prior should be chosen as uninformative [19] . This 
requires a flat metric in parameter space, i.e. that does not artificially favour certain 
parameter values. Depending on the parametrisation of the model it can in practice 
be difficult to obtain a flat metric and hence to specify an uninformative prior [27] . 

The most crucial problem that MCMC sampling faces is to ensure that the 
samples obtained realistically represent the actual posterior PDF. One instance 
where the convergence of the Markov chain fails is if the posterior PDF is not proper 
[28]. Posterior PDF's are called proper if they are integrable [19]. This means that it 
has to tail out to zero sufficiently fast over the appreciable range of the parameters 
such that its integral can be normalised to one. Non-identifiable parameters cause 
the posterior PDF to be non-proper [28]. This indicates that neither the prior 
assumptions nor the likelihood that represents the experimental data constrain the 
posterior PDF sufficiently. A practical consequence is that the Markov chain cannot 
converge and hence gives inaccurate results [29]. For convergence it is required 
that the Markov chain is positive recurrent which is not given in the case of non- 
identifiability [30] . The user must ensure parameter identifiability before an MCMC 
technique can be used securely. If models contain many unknown parameters and 
the posterior PDF is not sufficiently constrained the convergence of the Markov 
chain can be impractically slow even for a proper posterior PDF. 

2. Results 

For reliable inference in the presence of non-identifiability, we propose a joint ap- 
proach that takes advantage of both profiling and MCMC sampling methods. The 
profile likelihood is suitable to detect parameter non-identifiability [7] . Furthermore, 
it allows for experimental design that helps to resolve parameter non-identifiability 
[18]. This ensures that the posterior PDF is proper and well constrained. Subse- 
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A workflow of the joint approach 
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Figure 2. (A) Workflow of the joint approach that combines profile likelihood and MCMC 
sampling methods: The first step is setting up an appropriate mathematical model (B) 
that contains the available prior knowledge and explains the experimental data sufficiently 
well (C). The profile likelihood approach allows one to detect and to resolve non-identi- 
fiability by experimental design. Therefore, it helps to ensure that the posterior PDF is 
proper and sufficiently well constrained. Subsequently, MCMC sampling can be applied 
securely. Finally, the uncertainty in model predictions can be assessed realistically using 
the obtained samples. (B) The figure shows the structure of the ODE model describing 
Epo and EpoR interactions. The dashed line indicates the cell membrane, the part above 
is the outside, the part below is the inside of the cell. (C) The agreement of experimen- 
tal data and model output for MAP parameter values for both experimental setups is 
displayed. The extended setup was derived by experimental design considerations, see in 
[18]. 



quently, efficient MCMC sampling [26] can be used reliably to generate samples of 
the posterior PDF. Finally, the uncertainty in model predictions can be assessed 
realistically. The workflow of this joint approach is displayed in figure 2A. 

Both profiling and MCMC sampling methods separately are used for inference 
in various fields, e.g. in particle physics [1, 31] or in climate research [2, 3]. Here, we 
present results of using the joint approach that takes advantage of both methods 
for an application from cell biology [32]. An ordinary differential equation (ODE) 
model was used to describe the concentration dynamics of six molecular compounds 
involved in the interplay of the hormone erythropoietin (Epo) and its corresponding 
membrane receptor (EpoR), see figure 2B. Epo is an important factor in the differ- 
entiation of blood cells. The dynamics of the molecular compounds are formulated 
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as 

dx(t,6)/dt = f(x(t,9),9) (2.1) 

and a mapping of the dynamics to experimentally accessible quantities by the model 
output function 

y(t,6)=g(x(t,0),0) (2.2) 

that enters the likelihood function (1.1) was used. The model comprises six ODEs 
and ten unknown parameters including one nuisance parameter, for details about 
the model equations see the supplementary notes. All parameters are positive by 
definition, hence the logarithmic space yields a flat metric [19]. Since no empirical 
evidence about the values of the nine parameters of interest was available the prior 
was chosen uninformative. 

Radioactively labeled Epo facilitated the measurement of Epo concentration 
in the extracellular medium and of Epo bound to the cell membrane, see supple- 
mentary table 1. The MAP estimates of the model parameters were obtained by 
numerical optimisation, see supplementary table 2. The agreement of model out- 
puts and experimental data for this initial experimental setup is shown in figure 2C. 
In accordance with the experimental data the model describes the binding of Epo 
to the Epo receptor on the membrane, internalisation and recycling of the Epo- 
EpoR complex. As first step of the proposed joint approach the posterior PDF was 
screened for non-identifiability using the profiling method. 

(a) Identifiability analysis using posterior profiles 

The profile likelihood approach for identifiability analysis [7] can be translated 
to a profile posterior approach. In analogy to equation (1.2) profiling can be applied 
to the unnormalised posterior PDF by defining the profile posterior 

PP(6 i \y)=max[P(e\y)], (2.3) 

c.f. figure 1. Technical details on the implementation of the methods are given in the 
supplementary notes. For the initial experimental setup, the profiles of the posterior 
PDF reveal four structurally non-identifiable, two practically non-identifiable and 
three identifiable parameters. One of each case is displayed as red lines in figure 3A, 
the remaining profiles are shown in supplementary figure 1. 

(b) Markov chain Monte Carlo sampling 

The results of the posterior profiling approach indicates that the posterior PDF 
for the initial experimental setup is not proper, i.e. the posterior PDF cannot be 
normalised to one and is therefore not a valid PDF. In this situation an MCMC 
sampling cannot converge and hence gives inaccurate results [29]. In order to allow 
for a comparison between profiling and MCMC sampling, the prior PDF was re- 
stricted artificially to a uniform distribution with support from 10~ 5 to 10 +5 . The 
resulting posterior PDF is now proper and the Markov chain can in principle con- 
verge. It is important to note that the results of MCMC sampling can potentially 
be biased due to the artificial specification of this prior PDF. The new posterior 
PDF, though proper, still exhibits a complicated structure. The probability mass is 
distributed widely in the ten dimensional parameter space, see the posterior profiles 
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Figure 3. Comparison of profiling and sampling results: (A) For the initial experimental 
setup the posterior profiles indicated by red lines reveal that parameter k on is structurally 
non-identifiable, parameter kt is practically non-identifiable and parameter kde is identifi- 
able. The histograms display the marginalised MCMC samples obtained by the MMALA 
algorithm. For the identifiable parameter kd B both results of profiling and sampling agree 
quite well. Also for the structurally non-identifiable parameter k on the agreement is ac- 
ceptable. For the practically non-identifiable parameter kt the results are substantially 
different. The profile shows that the MAP point is located at log 10 (fct) w —1.8. However, 
the lion's share of the marginalised MCMC samples propose log 10 (fct) to be > 0. (B) 
Taking into account more experimental data the posterior profiles for the extended ex- 
perimental setup indicate that all parameters are now identifiable. The results of MCMC 
sampling and profiling are in good agreement. Interestingly, the MCMC samples for pa- 
rameter kt for the extended setup are localised close to the MAP point of the initial setup, 
note the different scales on the x-axis for (A) and (B). The dashed red lines indicates the 
threshold A Q that can be used to assess likelihood based confidence intervals. 



in figure 3A. In this situation, the SIM sampling algorithm, that uses a scale iden- 
tity matrix for generating proposals, was too inefficient and did not yield reasonable 
results within an acceptable time, see supplementary figures 1-3. To improve effi- 
ciency the MMALA algorithm that takes into account the local geometry of P(6\y) 
was used [26]. Figure 3A shows the histograms of marginalised MCMC samples. 
The remaining results are displayed in supplementary figures 4-6. For the identifi- 
able parameter fc<; e and for the structural non-identifiable parameter k on the results 
of the MCMC sampling are in good agreement with the results of the profiling. For 
the practically non-identifiable parameter k t the results are substantially different. 
The profile shows that the MAP point is located at log 10 (fc t ) w —1.8. However, the 
lion's share of the marginalised MCMC samples propose log 10 (fe t ) to be > 0. In this 
region the posterior profile reveals a plateau. 
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Figure 4. Propagation of parameter uncertainties to model predictions: The posterior 
samples that were acquired by MCMC sampling for the extended setup, see figure 3B. can 
now be used to assess prediction uncertainties accurately. The figure shows the predicted 
dynamics of three molecular components that could not be measured directly. The red 
line shows the prediction that corresponds to the MAP estimate of the parameters. The 
grey scale displays the posterior PDF of the prediction. 



(c) Taking into account additional experimental data. 

Both the results of posterior profiling and MCMC sampling indicate substantial 
uncertainty in the parameter estimates for the initial experimental setup. Based 
on the results of the profiling approach additional experiments were suggested, 
see in [18] for details. The target of the experimental design was to resolve non- 
identifiabilities. The additional experimental data were included in the estimation 
procedure, yielding an extended experimental setup, see figure 2C. Figure 3B shows 
the recomputed posterior profiles. They indicate, by tailing out to zero, that the 
identifiability problems were resolved for all parameters. The remaining profiles are 
shown in supplementary figure 7. 

As a consequence the posterior PDF is now proper without artificial assump- 
tions on the prior PDF. In this situation, the SIM sampling algorithm was still 
too inefficient and did not yield reasonable results within an acceptable time, see 
supplementary figures 7-9. To improve efficiency the MMALA algorithm that takes 
into account the local geometry of P{Q\y) was used again [26] . Figure 3B shows the 
histograms of marginalised MCMC samples. The results for the remaining parame- 
ters are shown in supplementary figures 10-12. For all parameters the results of the 
MCMC sampling and posterior profiling are now in good agreement. The MCMC 
samples for parameter k t that showed substantial difference between profiles and 
MCMC samples for the initial setup are now in good agreement as well. Interest- 
ingly, the MCMC samples for parameter kt for the extended setup are localised 
close to the MAP point of the initial setup. This suggests that the large probability 
mass of MCMC samples for values of log 10 (fc t ) > obtained for the initial setup 
was misleading. 

MCMC sampling results are now reliable and allow one to consider the full 
high dimensional posterior PDF for inference, see e.g. the non-linear correlation 
between parameter kdi and k^ e shown in supplementary figure 13. Using the gener- 
ated MCMC samples, the parameter uncertainties contained in the posterior PDF 
can be propagated accurately to the prediction of the dynamics of molecular com- 
pounds that are not accessible by experiments directly, see figure 4. 
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3. Summary 

We introduced a joint approach that combines frequentist profiling and efficient 
Bayesian MCMC sampling methods. The proposed approach uses the analysis of 
posterior profiles that helps to determine the identifiability of the model parameters. 
Non-identifiability results in a posterior distribution that is not proper, i.e. that 
cannot be normalised to one. However, a proper posterior distribution is required 
for convergence of the MCMC sampling. The results of the posterior profiling can 
be used for experimental design that helps resolve non-identifiabilities iteratively. 
Consequently, this ensures that the posterior distribution is proper. Having ensured 
identifiability of the parameters, the MCMC sampling results are reliable and can 
be used to propagate the uncertainty of parameter estimates to model predictions. 
Using an application from cell biology we showed that this approach enables one 
to obtain accurate results. 

We compared the results of posterior profiling and marginalising over the MCMC 
samples for two stages of experimental setup: an initial setup that contains non- 
identifiabilities and an extended setup where all parameters are identifiable. A uni- 
form prior distribution ensured that the posterior is proper despite non-identifia- 
bility for the initial setup. For the extended setup the results of posterior profiling 
and marginalising over the MCMC samples are in good agreement. For the ini- 
tial setup substantial difference between profiles and MCMC samples are observed. 
Interestingly, the profiles for the initial setup reflect the posterior distribution for 
the extended setup much better than the MCMC samples of the initial setup. This 
indicates that MCMC sampling results in the presence of non-identifiability can be 
inaccurate despite a proper posterior distribution. 
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1. Model description and experimental setup. 

The mathematical model presented in [1] describes the concentration dynamics 
of molecular compounds x involved in the interplay of the hormone erythropoietin 
(Epo) and its corresponding membrane receptor (EpoR). Epo is an important factor 
in the differentiation of blood cells. The dynamics of the molecular compounds are 
formulated as ordinary differential equations (ODE) dx(t,9)/dt = f(x(t,9),9) and 
a mapping of the compounds to experimentally accessible quantities by a function 
y(t, 9) = g(x(t, 9), 9) is used. The model comprises six dynamical variables and ten 
unknown parameters including one nuisance parameter. The dynamics are given by 
the ODE system 

d/dixi = -k on ■ xi ■ x 2 + k on ■ k D ■ x 3 + k ex ■ x 4 

d/ dt x 2 = -k on ■ x\ ■ x 2 + k on ■ k D ■ x 3 + k t ■ B max - k t ■ x 2 + k ex ■ x 4 

d/dt x s = +k on -x\-x 2 - k on ■ k D ■ x ? , - k e ■ x 3 

d/dt x 4 = +k e ■ x 3 - k ex ■ x 4 - k dl ■ x 4 - k de ■ x 4 

d/dtx 5 = +k di -x 4 

d/dt x 6 = +k de ■ x 4 

that describes the time evolution of the concentration of molecular compounds. The 
modelled compounds are: Epo in the extracellular medium (xi, Epo), Epo receptor 
on the cell membrane (x 2 , EpoR), occupied receptor complex on the cell membrane 
(X3, Epo_EpoR), internalized receptor complex (x 4 , Epo_EpoR_i), degraded Epo 
inside the cell (2:5, dEpo_i) and degraded Epo in the extracellular medium (xg, 
dEpo_e). It is assumed that two compounds have non zero initial value defined as 
unknown parameters: x 1 (t = 0) = Epo and x 2 (t — 0) = B max . The molecular 
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interaction that are considered are: ligand-independent EpoR endocytosis (fc t ), as- 
sociation of Epo and EpoR (k on ), dissociation of Epo and EpoR (k a ff = k on ■ kr>) 
with dissociation constant for EpoJEpoR (ko), ligand-induced EpoR endocytosis 
(fc e ), recycling of Epo and EpoR (k ex ), degradation of ligand-EpoR complex that 
remaining intracellular (fc^i) and is secreted extracellular (kde)- The experimentally 
accessible quantities are given by 

i/i = scale ■ (x\ + Xq) 

j/2 = scale ■ x s 

2/3 = scale ■ (x 4 + x 5 ) 

and describe measurements of radioactively labeled Epo in different compartments: 
in the extracellular medium (2/1, Epo_ext), on the cell membrane (2/2, Epo_mem) 
and inside the cell (2/3, Epoint), see table 1 for data. The mathematical model 
and the experimental data is available online from the original publication [1] in 
standard formats. 

For the analysis two experimental setups are distinguished. The initial ex- 
perimental setup includes only measurements of 2/1 and 2/2 and results in non- 
identifiability, see posterior profiles in figure 1. Based on the results of the profiling 
approach additional experiments were suggested, see in [2] for details. The target 
of the experimental design was to resolve non-identifiabilities. Additional measure- 
ments of 2/3 and independent direct measurements of B max , kp and EpoQ were 
included in the estimation procedure. The posterior profiles computed for the ex- 
tended setup confirm that the non-identifiabilities were resolved, see in figure 7. 
The MAP parameter values for the extended setup are given in table 2. 

For MCMC sampling both initial and extended setup have been analysed by 
the SIM and MMALA algorithms. The SIM algorithm uses a scaled identity matrix 
for generating proposals. To improve efficiency the MMALA algorithm that takes 
into account the local geometry of the posterior PDF [3]. 1) For the initial setup 
the results of the SIM algorithm are displayed in figures 1-3. They show that the 
SIM algorithm produced correlated results and did not converge yet. 2) For the 
initial setup the results of the MMALA algorithm are displayed in figure 4-6. The 
MMALA algorithm converged. 3) For the extended setup the results of the SIM 
algorithm are displayed in figures 7-9. They show that the SIM algorithm still 
produced correlated results and did not converge yet. 4) For the extended setup 
the results of the MMALA algorithm are displayed in figure 10-12. The MMALA 
algorithm converged. 

2. Implementation of the numerical methods. 

The ODE system was solved by the CVODES algorithm [4] . For numerical optimi- 
sation the trust-region method LSQNONLIN from MATLAB was used yielding the 
MAP estimates [5] . For efficiency it takes into account local gradient and curvature 
information. For the calculation of the posterior profiles as shown in figures 1, 4, 7 
and 10 an algorithm described in [6] can be used. 

Both MMALA and LSQNONLIN rely on the accuracy of first order sensitivities 
dy/dO. For ODE systems finite difference should not be used to calculate sensitivi- 
ties [7] . Therefore the sensitivity equations [8] are solved simultaneously by the ODE 
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solver CVODES. For the implementation of the MMALA algorithm, the simplified 
version was used that is computationally more efficient [3]. 

For the MCMC sampling, it is important to ensure that the Markov process 
starts in a high density region of P(8\y) either by performing a burn-in or by search- 
ing for the MAP point, the latter option was used here. To monitor convergence of 
the Markov chain the generated samples can be checked for independence, e.g. by 
computing their auto-correlation function (ACF), see figures 2, 5, 8 and 11. If the 
samples are correlated thinning can be used to increase independence. Correlation 
in the samples can be visualised directly by the Markov chains, see figures 3, 6, 9 
and 12. Each sampling run was aimed at obtaining 10 4 independent samples after 
applying thinning. For the SIM algorithm the 1/100 thinning still did not result 
in independent samples. The SIM algorithm turned out to be impractical both for 
the initial and extended setup. In contrast, the MMALA algorithm was much more 
efficient. A thinning of 1/100 and 1/10 for the initial and respectively the extended 
setup was sufficient to obtain 10 4 independent samples. 

For the propagation of the parameter uncertainties contained in the posterior 
PDF to the prediction of the dynamics of unobserved molecular compounds, all 
trajectories corresponding to the samples shown in figure 12 or 13 were evaluated. 
For each time point of the resulting set of curves a density estimate [9] was computed 
using the MATLAB function KSDENSITY. 

For the calculations MATLAB on a 2.1 GHz quad-core processor was used. The 
calculation of the profiles took w 5 minutes for the initial setup and w 1 minute of 
computation time for the extended setup. The generation of 10 6 MCMC samples 
using the MMALA algorithm took w 18.5 hours for the initial setup and « 20 
minutes of computation time for the extended setup. 
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time / min 


Epo_ext 


/ cpm 


Epo_mem 


/ cpm 


Epoint 


/ cpm 





321194 


±9461 










5 


277333 


±4984 


44698 


±1731 


9125 


±274 


20 


249910 


±3957 


44904 


±1123 


50698 


±346 


60 


231849 


±12291 


27209 


±2161 


83887 


±2311 


120 


222047 


±2145 


23076 


±1624 


102060 


±4124 


180 


235203 


±7075 


13680 


±1057 


97952 


±1493 


240 


251188 


±6437 


10024 


±1523 


82699 


±4096 


300 


260945 


±5200 


5447 


±436 


65967 


±3741 



Table 1. Experimental data: The experimentally accessible quantities are time-course mea- 
surements of radioactively labeled Epo collected in triplicates and units of counts per minute 
(cpm) over time in minutes (min) in different compartments: in the extracellular medium 
(yi, Epo-ext), on the cell membrane (yi, Epo-mem) and inside the cell (y-j, EpoJint). 



parameter 


log lo (0) 


unit 


Bmax 


±2.7126 


pM 


Epoo 


±3.3075 


pM 


k D 


±2.2148 


pM 




-1.7850 


1 /min 




-2.4977 


1/min 


k e 


-1.1259 


1 /min 


kex 


-2.0027 


1/min 


h 


-3.9790 


l/(pM-min) 


kt 


-1.4823 


1/min 


scale 


±2.2311 


cpm/pM 



Table 2. MAP estimates of the model parameters: The estimated values 6 were ob- 
tained by maximising the unnormalised posterior PDF by using the optimisation algorithm 
LSQNONLIN [5] from MATLAB for the extended experimental setup. 
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Figure 1. Results of the posterior profiling and MCMC sampling using the SIM algorithm 
for the initial setup: In contrast to the results obtained by the simplified MMALA algo- 
rithm shown in figure 4, the SIM algorithm converges too slowly for parameters affected 
by non-identifiability. The sampling results are of bad quality. 10 6 MCMC samples were 
generated and a thinning of 1/100 was used. The auto-correlation function of the samples 
still shows high correlation, see figure 2. This indicates that the process did not converge 
yet. The Markov chains for each parameter are displayed in figure 3. 
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Figure 2. ACF of the Markov chain using the SIM algorithm for the initial setup: The 
figure displays the auto-correlation function (ACF) of the generated MCMC samples for 
each parameter along the horizontal axis. A thinning of 1/100 was used to increase inde- 
pendence of the samples. A rapidly decaying ACF indicates independence of the samples 
and convergence of the Markov chain. The sample are still highly correlated. 
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Figure 3. Markov chains for the initial setup using the SIM algorithm: The figure displays 
the generated samples for each parameter after thinning along the horizontal axis. For 
parameters affected by non-identifiability, i.e. parameters in the upper and lower row, the 
Markov process did not converge after 10 4 samples were generated. In contrast to the 
results obtained by the simplified MMALA algorithm the SIM algorithm converges too 
slowly. 
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Figure 4. Results of the posterior profiling and MCMC sampling using the MMALA al- 
gorithm for the initial setup: For most parameters the results of profiling and MCMC 
sampling are similar though not identical. For parameter kt substantial difference are ob- 
served. 10 6 MCMC samples were generated. A thinning of 1/100 was. The independence 
of samples was checked by their auto-correlation function, see figure 5. This indicates that 
the process did converge. The Markov chains for each parameter are displayed in figure 6. 
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Figure 5. ACF of the Markov chain using the MMALA algorithm for the initial setup: 
The figure displays the auto-correlation function (ACF) of the generated MCMC samples 
for each parameter along the horizontal axis. A thinning of 1/100 was used to increase 
independence of the samples. A rapidly decaying ACF indicates independence of the sam- 
ples and convergence of the Markov chain. For parameter k on some dependency remained. 
However, the ACF drops to a value close to zero within a lag of 50 which is much smaller 
than the number of MCMC samples generated. 
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Figure 6. Markov chains using the MMALA algorithm for the initial setup: The figure 
displays the generated samples for each parameter after thinning along the horizontal 
axis. 
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Figure 7. Results of the posterior profiling and MCMC sampling using the SIM algorithm 
for the extended setup: In contrast to the results obtained by the simplified MMALA 
algorithm shown in figure 10, the SIM algorithm converges too slowly. Nevertheless, the 
agreement of profiles and MCMC samples is already acceptable. 10 6 MCMC samples 
were generated and a thinning of 1/100 was used. The auto-correlation function of the 
samples still shows high correlation, see figure 8. The Markov chains for each parameter 
are displayed in figure 9 and indicate that the process did not converge yet. The dashed 
red lines indicates the threshold A Q that can be used to assess likelihood based confidence 
intervals. 
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Figure 8. ACF of the Markov chain using the SIM algorithm for the extended setup: The 
figure displays the auto-correlation function (ACF) of the generated MCMC samples for 
each parameter along the horizontal axis. A thinning of 1/100 was used to increase inde- 
pendence of the samples. A rapidly decaying ACF indicates independence of the samples 
and convergence of the Markov chain. However the sample are still highly correlated. 
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Figure 9. Markov chains for the extended setup using the SIM algorithm: The figure 
displays the generated samples for each parameter after thinning along the horizontal 
axis. For some parameters the Markov process did not converge after 10 4 samples were 
generated. In contrast to the results obtained by the simplified MMALA algorithm the 
SIM algorithm converges too slowly. 
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Figure 10. Results of the posterior profiling and MCMC sampling using the MMALA 
algorithm for the extended setup: the results of profiling and MCMC sampling are in 
good agreement. 10 MCMC samples were generated. A thinning of 1/10 was used. The 
independence of samples was checked by their auto-correlation function, see figure 11. 
This indicates that the process did converge. The Markov chains for each parameter are 
displayed in figure 12. The dashed red lines indicates the threshold A a that can be used 
to assess likelihood based confidence intervals. 
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Figure 11. ACF of the Markov chain using the MMALA algorithm for the extended setup: 
The figure displays the auto-correlation function (ACF) of the generated MCMC samples 
for each parameter along the horizontal axis. A thinning of 1/10 was used to increase inde- 
pendence of the samples. A rapidly decaying ACF indicates independence of the samples 
and convergence of the Markov chain. 
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Figure 12. Markov chains using the MMALA algorithm for the extended setup: The figure 
displays the generated samples for each parameter after thinning along the horizontal axis. 
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Figure 13. Dependency structure of posterior samples: The results of the MCMC sam- 
pling allow for considering the full high dimensional posterior distribution. For example, 
non-linear relationship between parameter kdi and kd e are taken into account. 



Article to appear in Phil. Trans. Roy. Soc. A 



